Visualising Basins of Attraction for the Cross-Entropy and the Squared Error Neural Network Loss Functions
Quantification of the stationary points and the associated basins of
attraction of neural network loss surfaces is an important step towards a
better understanding of neural network loss surfaces at large. This work
proposes a novel method to visualise basins of attraction together with the
associated stationary points via gradient-based random sampling. The proposed
technique is used to perform an empirical study of the loss surfaces generated
by two different error metrics: quadratic loss and entropic loss. The empirical
observations confirm the theoretical hypothesis regarding the nature of neural
network attraction basins. Entropic loss is shown to exhibit stronger gradients
and fewer stationary points than quadratic loss, indicating that entropic loss
has a more searchable landscape. Quadratic loss is shown to be more resilient
to overfitting than entropic loss. Both losses are shown to exhibit local
minima, but the number of local minima is shown to decrease with an increase in
dimensionality. Thus, the proposed visualisation technique successfully
captures the local minima properties exhibited by the neural network loss
surfaces, and can be used for the purpose of fitness landscape analysis of
neural networks.
Comment: Preprint submitted to the Neural Networks journal.
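As a rough illustration of the gradient-based random sampling idea, the sketch below descends from random starting points on a toy two-dimensional surface and collects the distinct end points the walks settle into. The loss function, step size, and walk counts are all illustrative assumptions, not the paper's actual experimental setup:

```python
import numpy as np

def loss(w):
    # Toy 2-D surface with several basins (an assumed stand-in for a
    # neural network loss; the paper samples real network error landscapes).
    return np.sin(3 * w[0]) * np.cos(3 * w[1]) + 0.1 * np.dot(w, w)

def grad(w, eps=1e-6):
    # Central-difference gradient of the toy loss.
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (loss(w + e) - loss(w - e)) / (2 * eps)
    return g

def sample_basins(n_walks=50, n_steps=200, lr=0.05, seed=0):
    # Gradient-based random sampling: start walks at random points,
    # descend the gradient, and record where each walk settles.
    rng = np.random.default_rng(seed)
    stationary = []
    for _ in range(n_walks):
        w = rng.uniform(-2, 2, size=2)
        for _ in range(n_steps):
            w = w - lr * grad(w)
        stationary.append(np.round(w, 2))
    # Distinct end points approximate the stationary points; the set of
    # starting points that reach each one outlines its basin of attraction.
    return {tuple(s) for s in stationary}

print(len(sample_basins()))
```

Grouping the starting points by the stationary point they converge to is what allows the basins themselves to be visualised.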
Fitness Landscape Analysis of Feed-Forward Neural Networks
Neural network training is a highly non-convex optimisation problem with
poorly understood properties. Due to the inherent high dimensionality, neural
network search spaces cannot be intuitively visualised, thus other means to
establish search space properties have to be employed. Fitness landscape
analysis encompasses a selection of techniques designed to estimate the
properties of a search landscape associated with an optimisation problem.
Applied to neural network training, fitness landscape analysis can be used to
establish a link between the properties of the error landscape and various
neural network hyperparameters. This study applies fitness landscape analysis
to investigate the influence of the search space boundaries, regularisation
parameters, loss functions, activation functions, and feed-forward neural
network architectures on the properties of the resulting error landscape. A
novel gradient-based sampling technique is proposed, together with a novel
method to quantify and visualise stationary points and the associated basins
of attraction in neural network error landscapes.
Thesis (PhD), University of Pretoria, 2019.
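The kind of property estimation that fitness landscape analysis performs can be illustrated with a simple random-walk ruggedness measure. The error function, walk parameters, and lag-1 autocorrelation statistic below are illustrative assumptions rather than the thesis's exact techniques:

```python
import numpy as np

def error(w):
    # Toy error function standing in for a neural network error landscape.
    return np.sum(w ** 2) + 0.5 * np.sin(5 * np.sum(w))

def random_walk_fitness(n_steps=1000, step_size=0.1, dim=10, seed=1):
    # Random walk through the search space, recording error values.
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1, 1, size=dim)
    values = []
    for _ in range(n_steps):
        w = w + rng.uniform(-step_size, step_size, size=dim)
        values.append(error(w))
    return np.array(values)

def autocorrelation(values, lag=1):
    # Lag-1 autocorrelation of the walk: values near 1 suggest a smooth
    # landscape, values near 0 a rugged one.
    v = values - values.mean()
    return np.dot(v[:-lag], v[lag:]) / np.dot(v, v)

f = random_walk_fitness()
print(round(autocorrelation(f), 3))
```

Statistics of this kind, computed from sampled walks, are what link hyperparameter choices to measurable landscape properties.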
Black-Box Saliency Map Generation Using Bayesian Optimisation
Saliency maps are often used in computer vision to provide intuitive
interpretations of what input regions a model has used to produce a specific
prediction. A number of approaches to saliency map generation are available,
but most require access to model parameters. This work proposes an approach for
saliency map generation for black-box models, where no access to model
parameters is available, using a Bayesian optimisation sampling method. The
approach aims to find the global salient image region responsible for a
particular (black-box) model's prediction. This is achieved by a sampling-based
approach to model perturbations that seeks to localise the regions of an
image that are salient to the black-box model. Results show that the proposed
approach to
saliency map generation outperforms grid-based perturbation approaches, and
performs similarly to gradient-based approaches which require access to model
parameters.
Comment: Submitted to IJCNN 202
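A minimal sketch of the perturbation idea follows, with the Bayesian optimiser deliberately replaced by plain random sampling for brevity. The toy black-box model, image size, and rectangular region parameterisation are all assumptions:

```python
import numpy as np

def black_box_predict(image):
    # Hypothetical black-box model: responds only to the mean intensity of
    # a fixed "salient" patch (rows 8-16, cols 8-16) of a 32x32 image.
    return image[8:16, 8:16].mean()

def occlusion_score(image, x, y, size):
    # Score a candidate region by how much occluding it changes the
    # prediction; no access to model parameters or gradients is needed.
    occluded = image.copy()
    occluded[y:y + size, x:x + size] = 0.0
    return black_box_predict(image) - black_box_predict(occluded)

def find_salient_region(image, n_samples=200, seed=0):
    # Sampling loop over candidate regions; the paper drives this search
    # with Bayesian optimisation, replaced here by random sampling.
    rng = np.random.default_rng(seed)
    best, best_score = None, -np.inf
    for _ in range(n_samples):
        size = int(rng.integers(4, 16))
        x = int(rng.integers(0, 32 - size))
        y = int(rng.integers(0, 32 - size))
        s = occlusion_score(image, x, y, size)
        if s > best_score:
            best, best_score = (x, y, size), s
    return best

print(find_salient_region(np.ones((32, 32))))
```

A Bayesian optimiser would use the scores observed so far to propose the next region, needing far fewer queries than random or grid search.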
Genetic Micro-Programs for Automated Software Testing with Large Path Coverage
Ongoing progress in computational intelligence (CI) has led to an increased
desire to apply CI techniques for the purpose of improving software engineering
processes, particularly software testing. Existing state-of-the-art automated
software testing techniques focus on utilising search algorithms to discover
input values that achieve high execution path coverage. These algorithms are
trained on the same code that they intend to test, requiring instrumentation
and lengthy search times to test each software component. This paper outlines a
novel genetic programming framework, where the evolved solutions are not input
values, but micro-programs that can repeatedly generate input values to
efficiently explore a software component's input parameter domain. We also
argue that our approach can be generalised to many different software systems,
and is thus not specific to the particular software component on which it was
trained.
Comment: A version of this paper has been accepted for publication in CEC'2
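The micro-program idea can be sketched roughly as follows. The op set, the toy component under test, and the mutation-only genetic loop are illustrative assumptions, not the paper's actual framework:

```python
import random

def software_under_test(x):
    # Toy component: returns which execution path an input triggers.
    if x < 0:
        return "negative"
    if x == 0:
        return "zero"
    if x % 2 == 0:
        return "even"
    return "odd"

OPS = ["add1", "sub1", "double", "negate"]

def run_micro_program(program, n_values=20):
    # A micro-program is a short op sequence applied repeatedly to a
    # counter, emitting one candidate input value per iteration.
    x, values = 1, []
    for _ in range(n_values):
        for op in program:
            if op == "add1":
                x += 1
            elif op == "sub1":
                x -= 1
            elif op == "double":
                x *= 2
            elif op == "negate":
                x = -x
        values.append(x)
    return values

def coverage(program):
    # Fitness: number of distinct paths exercised by the emitted inputs.
    return len({software_under_test(v) for v in run_micro_program(program)})

def evolve(generations=30, pop_size=20, prog_len=4, seed=0):
    # Minimal mutation-only genetic loop over micro-programs.
    rng = random.Random(seed)
    pop = [[rng.choice(OPS) for _ in range(prog_len)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=coverage, reverse=True)
        survivors = pop[:pop_size // 2]
        children = []
        for p in survivors:
            child = p[:]
            child[rng.randrange(prog_len)] = rng.choice(OPS)
            children.append(child)
        pop = survivors + children
    return max(pop, key=coverage)

best = evolve()
print(coverage(best))
```

The key contrast with conventional search-based testing is that the evolved artefact is a reusable input generator, not a fixed set of input values.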
Empirical Loss Landscape Analysis of Neural Network Activation Functions
Activation functions play a significant role in neural network design by
enabling non-linearity. The choice of activation function was previously shown
to influence the properties of the resulting loss landscape. Understanding the
relationship between activation functions and loss landscape properties is
important for neural architecture and training algorithm design. This study
empirically investigates neural network loss landscapes associated with
hyperbolic tangent, rectified linear unit, and exponential linear unit
activation functions. Rectified linear unit is shown to yield the most convex
loss landscape, while exponential linear unit is shown to yield the least flat
loss landscape and to exhibit superior generalisation performance. The
presence of wide and narrow valleys in the loss landscape is established for
all activation functions, and the narrow valleys are shown to correlate with
saturated neurons and implicitly regularised network configurations.
Comment: Accepted for publication in the Genetic and Evolutionary Computation
Conference Companion, July 15--19, 2023, Lisbon, Portugal.
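One common way to probe such properties empirically is to evaluate the loss along a one-dimensional slice through weight space for each activation function. The tiny network, random data, and slice parameters below are illustrative assumptions, not the study's experimental setup:

```python
import numpy as np

# The three activation functions compared in the study.
def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

def elu(x, a=1.0):
    return np.where(x > 0, x, a * (np.exp(x) - 1))

def mse_loss(w, act, X, y):
    # Loss of a tiny one-hidden-layer network; weights flattened into w.
    W1, W2 = w[:10].reshape(5, 2), w[10:15].reshape(1, 5)
    return np.mean((W2 @ act(W1 @ X) - y) ** 2)

def loss_slice(act, n_points=50, seed=0):
    # 1-D slice of the loss along a random direction in weight space, a
    # common proxy for inspecting landscape flatness and convexity.
    rng = np.random.default_rng(seed)
    X, y = rng.standard_normal((2, 20)), rng.standard_normal((1, 20))
    w, d = rng.standard_normal(15), rng.standard_normal(15)
    alphas = np.linspace(-1, 1, n_points)
    return [mse_loss(w + a * d, act, X, y) for a in alphas]

for act in (tanh, relu, elu):
    s = loss_slice(act)
    print(act.__name__, round(min(s), 3), round(max(s), 3))
```

Comparing the shape of such slices across activation functions is one concrete way that convexity and flatness claims like those above can be inspected.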
Cauchy Loss Function: Robustness Under Gaussian and Cauchy Noise
In supervised machine learning, the choice of loss function implicitly
assumes a particular noise distribution over the data. For example, the
frequently used mean squared error (MSE) loss assumes a Gaussian noise
distribution. The choice of loss function during training and testing affects
the performance of artificial neural networks (ANNs). It is known that MSE may
yield substandard performance in the presence of outliers. The Cauchy loss
function (CLF) assumes a Cauchy noise distribution, and is therefore
potentially better suited for data with outliers. This paper aims to determine
the extent of robustness and generalisability of the CLF as compared to MSE.
CLF and MSE are assessed on a few handcrafted regression problems, and a
real-world regression problem with artificially simulated outliers, in the
context of ANN training. CLF yielded results that were either comparable to or
better than the results yielded by MSE, with a few notable exceptions.
Comment: A version of this paper was accepted for publication in SACAIR'2
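One common form of the Cauchy loss is shown below against MSE on residuals containing a single simulated outlier; the scale parameter c and the residual values are illustrative assumptions:

```python
import numpy as np

def mse(residuals):
    # Mean squared error: quadratic growth, so outliers dominate.
    return np.mean(residuals ** 2)

def cauchy_loss(residuals, c=1.0):
    # One common form of the Cauchy loss; it grows logarithmically, so
    # large residuals (outliers) are penalised far less than under MSE.
    return np.mean(np.log1p((residuals / c) ** 2))

r_clean = np.array([0.1, -0.2, 0.15, -0.05])
r_outlier = np.append(r_clean, 50.0)  # one simulated outlier

print(mse(r_outlier) / mse(r_clean))            # MSE inflates sharply
print(cauchy_loss(r_outlier) / cauchy_loss(r_clean))
```

The logarithmic growth is what makes the implied Cauchy noise assumption heavier-tailed, and hence more forgiving of outliers, than the Gaussian assumption behind MSE.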
Comparison Of Adversarial And Non-Adversarial LSTM Music Generative Models
Algorithmic music composition is a way of composing musical pieces with
minimal to no human intervention. While recurrent neural networks are
traditionally applied to many sequence-to-sequence prediction tasks, including
successful implementations of music composition, their standard supervised
learning approach based on input-to-output mapping leads to a lack of note
variety. These models can therefore be seen as potentially unsuitable for tasks
such as music generation. Generative adversarial networks learn the generative
distribution of data and lead to varied samples. This work implements and
compares adversarial and non-adversarial training of recurrent neural network
music composers on MIDI data. The resulting music samples are evaluated by
human listeners, and their preferences are recorded. The evaluation indicates
that adversarial training produces more aesthetically pleasing music.
Comment: Submitted to a 2023 conference, 20 pages, 13 figures.